FOREWORD


The following comments on handling and presenting statistical 
data are prompted by errors most commonly observed in reports sub- 
mitted for publication. Most readers will be familiar with the 
procedures outlined. If the present research aid reminds some 
analysts of fundamentals forgotten, or if it helps a few to sup- 
plement their research background, it will serve its purpose. 
THE PRESENTATION OF STATISTICAL DATA


I. Proofing.

The most common errors in handling quantitative data are mistakes
in simple arithmetical computation, transposition of figures, or other
faulty transcription of numbers. Theoretically, such errors never
should occur. In actuality, they will creep into even the most
carefully done research. However, constant awareness of the danger of
such errors, methodical research procedure, and careful proofing will
keep them to a minimum in the final product. A few rules for proofing,
well known by all research workers, but too frequently forgotten in
moments of haste or carelessness, follow:

1. All data should be proofed carefully following any transcription
and again when the project is completed.

2. Final proofing should be against original sources (whenever
possible) to avoid perpetuation of transcription errors that may have
developed during the course of research.

3. Proofing should be conducted by someone other than the person
who did the original work, since it is possible to make the same
mistake repeatedly.

4. If one must proof one's own work, an attempt should be made
to reverse processes, reading from copy to original, adding from bottom
to top columns originally added from top to bottom (or subtracting from
the totals figures that originally were added), multiplying where
division has been performed, and so forth.

5. Consideration of the general order of magnitude of figures
is an essential part of the proofing process. Such consideration of
the logic of figures, as presented, will guard against significant
errors arising from failure to read a slide rule correctly or
misplacing a decimal point.
II. Significant Numbers.

1. Approximate Nature of Significant Numbers.

Spurious accuracy, a common mistake in the handling of statistical
data, too frequently is not recognized as an error. Numbers
used in abstract arithmetical work mean exactly what they say. The
number 7 (or 7.00 ... 0) means exactly seven, no more and no less.
Unfortunately, however, data used in most statistical computations are
not so precise. Except in cases where an actual count is possible,
data are, at best, measurements, the accuracy of which is limited both
by the quality of the instrument and by the accuracy of the observer.*
Inevitably, some estimation is involved.

The Rand McNally Reference and Road Atlas for 1950 states
that the distance from Washington, D.C., to Philadelphia is 141 miles.
Timetables give the railroad distance between these two cities as
135 miles. Although distances by automobile or railroad can be
measured by instruments, the chance that either of the distances given
is accurate to the last foot, or even to the nearest tenth of a mile,
is very slight. The last digit in either case is an approximation.
What the last digits signify is that the actual railroad distance lies
somewhere between 134.5 and 135.5 miles, while the highway distance
lies somewhere between 140.5 and 141.5 miles. If one is interested
only generally in the distance between the two cities (overlooking the
fact that the distances by automobile and railroad are two different
measurements), it would be adequate to say, on the basis of either
measurement, that the two cities are approximately 140 miles apart,
since both distances round to 140. In this case, only two digits
are significant, and the second digit (4) is an approximation. The
figure signifies that the true distance between Washington, D.C., and
Philadelphia lies somewhere between 135 miles and 145 miles.

Significant terminal digits are approximations indicating
that the true figure lies within plus or minus one-half of one unit
of the last digit. For example:

    58    = 57.5   to 58.5

    58.0  = 57.95  to 58.05

    58.00 = 57.995 to 58.005

(See IV, below, for a discussion of procedure followed in rounding odd
or even terminal digits.)


* What follows concerning the approximate nature of measurements
should not be confused with counting. It is possible to count
accurately; that is, the exact number of people present in a room
can be determined accurately by simple count. Such a datum, being
accurate, may be expressed to any degree of accuracy compatible with
companion data.
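
The half-unit rule for terminal digits can be sketched in a few lines
of Python. This is only an illustration: the function name
implied_range is hypothetical, and the sketch assumes the number is
supplied as written text so that trailing zeros after the decimal
point are preserved.

```python
from decimal import Decimal


def implied_range(written):
    """Return the (low, high) limits implied by a number as written.

    The last written digit is treated as significant, so the limits are
    plus or minus one-half of one unit of that digit's place.
    """
    value = Decimal(written)
    # The exponent is 0 for "58", -1 for "58.0", and -2 for "58.00".
    half_unit = Decimal(1).scaleb(value.as_tuple().exponent) / 2
    return value - half_unit, value + half_unit


for written in ("58", "58.0", "58.00"):
    low, high = implied_range(written)
    print(f"{written} = {low} to {high}")
```

Terminal zeros to the left of the decimal point (as in 2,000) are not
handled here, since their significance must be judged from context.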

2. Calculation with Significant Numbers.

Since data which purport to be actual measurements are only
approximations, it is obvious that data which have been estimated
must be treated as approximations. A few practices commonly accepted
in the handling of such approximations follow:

a. Write only as many digits as are known to be correct
(recognizing that the last digit will be an approximation), and add
as many zeros as are necessary to locate the decimal point.

b. Treat as significant all digits except zeros which are
included to indicate the location of the decimal point. (Both 2,000
and 0.0002 contain one significant figure.)

The significance of zero sometimes is difficult to determine.
In general, zeros are significant unless they occur

(1) At the extreme left of a number (in the number 0.02 the
zeros are not significant), or

(2) At the extreme right of a number and to the left of the
decimal point.

In the latter case, significance of the zeros must be determined from
context or from the method of writing the number. When terminal zeros
appear in a column, it is fairly safe to assume that numbers ending in
zero are significant to the same place as other numbers in the column.
In the following example, zeros to the right of the thousands column
obviously are present only to locate the decimal point and are not
significant. Zeros appearing in the thousands column for 1941 and 1944
would appear to be significant. It would be safe to assume that values
for all years have four significant figures.


    Year       $ Value

    1941     5,500,000
    1942     6,725,000
    1943     7,805,000
    1944     7,950,000
    1945     8,891,000


The significance of terminal zeros is less obvious where
a number is not related so clearly to other numbers which do not end in
zero. The number 100,000 standing by itself might be regarded as having
only one significant figure, yet one, two, three, or more of the zeros
could be significant. In such cases, the number of significant figures
can be indicated in a footnote or by writing the number in what is
known as standard, or scientific, notation. Standard notation should
be used only when it can be assumed that the reader will be familiar
with such notation. When this cannot be assumed, standard notation
may be used to keep decimals and significance of figures straight in
the calculation stage but should be omitted from finished reports.*
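
As a rough sketch of how standard notation makes the number of
significant figures explicit, the following Python uses the built-in
"e" format specifier. The helper name standard_notation and the
"x 10^n" output style are choices made for this illustration only.

```python
def standard_notation(value, significant_figures):
    """Write value with the stated number of significant figures.

    Relies on Python's "e" format, which keeps trailing zeros in the
    mantissa, so the zeros that are significant remain visible.
    """
    formatted = f"{value:.{significant_figures - 1}e}"
    mantissa, exponent = formatted.split("e")
    return f"{mantissa} x 10^{int(exponent)}"


# The number 100,000 with one, three, and four significant figures.
for figures in (1, 3, 4):
    print(standard_notation(100_000, figures))
```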




* Standard notation is based on the principle that every number
can be expressed as a multiple of some power of 10. An illustration
of this principle follows:

    15,625      =  1.5625 x 10^4
    1,562.5     =  1.5625 x 10^3
    156.25      =  1.5625 x 10^2
    15.625      =  1.5625 x 10^1   or 1.5625 x 10
    1.5625      =  1.5625 x 10^0   or 1.5625 x 1
    0.15625     =  1.5625 x 10^-1
    0.015625    =  1.5625 x 10^-2
    0.0015625   =  1.5625 x 10^-3

Positive exponents indicate the number of places the decimal point must
be moved to the right. Negative exponents indicate the number of places
the decimal point must be moved to the left. Inclusion of zeros in the
number to be multiplied by a power of 10 will indicate the number of
significant zeros in the number so expressed. Thus, in the case of the
number 100,000 mentioned above, if only the 1 is significant, the
standard notation would be 1 x 10^5. If the number contains significant
figures to the thousands column, it should be written 1.00 x 10^5. If
there are four significant figures, the 100,000 should be written as
1.000 x 10^5.

c. When adding or subtracting approximate numbers, round the
answer so that its last significant figure will fall in the same
column as the last significant figure of the original number having
its last significant figure farthest to the left.* To facilitate
computation, figures may be rounded prior to addition or subtraction so
that the answer will contain one (or two) more place(s) than ultimately
will be retained. For example:

       Dollars        Thousand Dollars

       360,000               360
    25,107,500            25,108
    25,320,000            25,320
                          ------
                          50,788

Since the first and last numbers are significant only to the
ten-thousands column, the sum should be rounded to 50,790,000.

* It should be noted that sometimes data are collected in such fashion
as to make determination of the significance of figures impossible. In
other cases, significance may be known, but strict adherence to the rules
for calculation with significant numbers may result in a loss of
information. For example, the distance from point A to point B may be
known to be approximately 100,000 miles. The distance from B to C may be
approximately 25,000 miles and that from C to D approximately 15,000
miles. If rules for adding significant numbers are followed in adding
the three together, the total distance from A to D would be reported as
100,000 miles, implying a range of from 50,000 to 150,000 miles. However,
it may be known that 140,000 miles (the sum of the three approximate
distances) is much closer to the true total distance than is 100,000. In
such a case, use of ranges (see III, below) will prevent loss of
information. By writing each of the measurements as range numbers -- for
example, 50,000 to 150,000; 24,500 to 25,500; and 14,500 to 15,500 -- it
is possible to calculate an upper and lower limit for the total
distance -- that is, 89,000 to 191,000 miles. A best estimate of the
measurement should be made and the range of error indicated in real or
percentage terms. In this instance the best estimate is 140,000 ±
51,000, or, after rounding the result so that it carries only one
more place than the least accurate figure, 140,000 ± 50,000.
(Another means of expressing this would be 140,000 ± 36 percent.)
Also, see the footnote on p. 9, below.
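
The addition rule in c can be sketched as follows. The representation
is an assumption made for illustration: each term is a pair of an
integer value and the power of ten of its last significant column
(4 for the ten-thousands column, 2 for the hundreds column). Python's
built-in round with a negative digit count does the final rounding.

```python
def add_approximate(terms):
    """Add (value, place) pairs and round the sum to the coarsest place.

    place is the power of ten of the last significant column of a term.
    The sum is rounded to the column farthest to the left among them.
    """
    coarsest_place = max(place for _, place in terms)
    total = sum(value for value, _ in terms)
    return round(total, -coarsest_place)


# The three dollar figures from the example under c.
terms = [(360_000, 4), (25_107_500, 2), (25_320_000, 4)]
print(f"{add_approximate(terms):,}")
```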

d. When multiplying or dividing approximate numbers, round
the answer so that it contains no more significant figures than
did the original number having the fewest significant figures. To
facilitate calculation, round off the number having the largest number
of significant figures so that it carries one (or two) more significant
figure(s) than does the number having the smallest number of significant
figures. Two examples follow:
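
A minimal sketch of rule d in Python is given here. The inputs 141
(three significant figures) and 2.5 (two significant figures) are made
up for this illustration and are not taken from the text, and the
helper name round_significant is hypothetical.

```python
import math


def round_significant(value, figures):
    """Round value to the stated number of significant figures."""
    if value == 0:
        return 0.0
    # Position of the leading digit: 2 for 352.5, 1 for 56.4.
    magnitude = math.floor(math.log10(abs(value)))
    return round(value, figures - 1 - magnitude)


fewest_figures = 2  # 2.5 carries fewer significant figures than 141
print(round_significant(141 * 2.5, fewest_figures))  # product 352.5
print(round_significant(141 / 2.5, fewest_figures))  # quotient 56.4
```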



1. Use Where Significant Numbers Lack Precision.

Significant figures do not always permit precise expression of
accuracy. They reflect accuracy of measurement, or error, to one-half
of one unit of the last digit. For example, the highway distance (141
miles) between Washington, D.C., and Philadelphia was measured to the
nearest mile. This information, written as 141 miles, implies that
the true distance is between 140.5+ and 141.5- miles.

Where accuracy (or error) cannot be reflected adequately by
significant numbers, and such accuracy is desired, a range must be used.
Significant numbers always can be expressed as a range, but the converse
is not always true. For example, it may be known that the means of
measurement used in determining the distance between Washington, D.C.,
and Philadelphia are accurate only to within 2 miles in measuring
distances of this magnitude. In this case, writing the result as 141
implies greater accuracy than is warranted. The distance might be
written as a significant number in the following ways:

a. 141 or 140.5+ to 141.5-

b. 140 (14 x 10^1) or 135 to 145

Since the true distance lies anywhere from 139 to 143 miles (141 plus
or minus 2), expression of the distance as a significant number is
misleading. None of the implied ranges exactly coincides with the
true range of possibilities. The true range cannot be expressed as a
significant number. In such cases, the number should be expressed as
a range (for example, 139 to 143, 141 plus or minus 2, or 141 plus or
minus 1.4 percent). Whenever possible, ranges should be stated in
terms of a best estimate (that is, the figure within the range which
is felt to be the best single representation of the measurement) and
an estimated range of error stated in real or percentage terms. These
values, the best estimate and the error term, should be expressed to
a degree of accuracy compatible with the significance of the original
data.


Where it is not necessary to the understanding or use of the
datum to have it expressed to the highest degree of accuracy, it is
advisable to express it in rounded form. (This rounding should be at
least sharp enough to include the true upper and lower limits. In
the case of the distance between Washington, D.C., and Philadelphia
this would mean 140 miles written as 14 x 10^1.)

2. Calculation with Range Numbers.

a. Method 1.

One method of performing calculations with range numbers
follows:

(1) Addition: Add lower limit to lower limit, and upper
limit to upper limit.

(2) Subtraction: Subtract the upper limit of the subtrahend
from the lower limit of the minuend, and the lower limit of
the subtrahend from the upper limit of the minuend.

(3) Multiplication: Multiply lower limit by lower limit,
and upper limit by upper limit.

(4) Division: Divide the lower limit of the dividend by the
upper limit of the divisor, and the upper limit of the
dividend by the lower limit of the divisor.


The discussion of range numbers thus far has been limited to
ranges written as a to b, for positive values only. The procedure is
changed to the extent of inverting and sign changing when negative
values appear in the ranges. Since negative values are not likely to
occur, their treatment has been omitted. Whether the values are
positive or negative, all that is required is trying the possibilities
and choosing the minimum and maximum values obtained. The omitted
procedure merely eliminates the necessity for trial and error where
negative values occur.
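
Method 1 and the trial-of-possibilities fallback can be sketched in
Python as below. Ranges are represented as (lower, upper) tuples, an
assumption of this illustration; the four named functions apply only
to positive ranges, as in the text, and by_trial tries every pairing
of limits. Divisor ranges that include zero are not handled.

```python
import operator
from itertools import product


def add(a, b):
    return (a[0] + b[0], a[1] + b[1])


def subtract(a, b):
    # Lower limit: minuend lower minus subtrahend upper, and vice versa.
    return (a[0] - b[1], a[1] - b[0])


def multiply(a, b):
    return (a[0] * b[0], a[1] * b[1])


def divide(a, b):
    return (a[0] / b[1], a[1] / b[0])


def by_trial(operation, a, b):
    """Try all four pairings of limits and keep the minimum and maximum."""
    results = [operation(x, y) for x, y in product(a, b)]
    return (min(results), max(results))


first, second = (8, 12), (4, 6)
print(add(first, second), subtract(first, second))
print(multiply(first, second), divide(first, second))
print(by_trial(operator.mul, (-2, 3), (4, 6)))
```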

b. Method 2.

(1) Another means of calculation with ranges treats
the measurement and the error in the form $x \pm e$,
where $x$ is a positive approximate number, or
measurement, and $e$ is the error. Illustrations
of this method follow*:

(a) Addition: Add approximate number ($x$) to approximate
number, and error ($e$) to error:

$(x_1 \pm e_1) + (x_2 \pm e_2) = (x_1 + x_2) \pm (e_1 + e_2)$

$(10 \pm 2) + (5 \pm 1) = 15 \pm 3$

(b) Subtraction: Subtract the subtrahend, and add
the errors:

$(x_1 \pm e_1) - (x_2 \pm e_2) = (x_1 - x_2) \pm (e_1 + e_2)$

$(10 \pm 2) - (5 \pm 1) = 5 \pm 3$


* $e_1$ and $e_2$ in the illustrations are in real terms. When the errors
are expressed as percentages, they must be translated into real terms
before calculation using Method 2.
(c) Multiplication: Pick extreme values of the
possible products:

$(x_1 \pm e_1)(x_2 \pm e_2) = x_1 x_2 \pm (e_1 x_2 + e_2 x_1) + e_1 e_2$

$(10 \pm 2)(5 \pm 1) = 50 \pm [(2 \times 5) + (1 \times 10)] + (2)(1) = 50^{+22}_{-18}$

that is, a minimum of 32 and a maximum of 72. Other results that may
be obtained by multiplying $(x_1 \pm e_1)(x_2 \pm e_2)$ lie
between the desired minimum and maximum values.

(d) Division:

$\frac{x_1 \pm e_1}{x_2 \pm e_2} = \frac{x_1}{x_2} \pm \frac{x_2 e_1 + x_1 e_2}{(x_2)^2}$

$\frac{10 \pm 0.2}{5 \pm 0.1} = \frac{10}{5} \pm \frac{(5 \times 0.2) + (10 \times 0.1)}{25} = 2 \pm 0.08$ or $2 \pm 0.1$

This form is approximate and should be used only when the relative value
of $e$ is small.
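
The multiplication and division forms of Method 2 can be checked with
a short Python sketch. The function names are hypothetical. The
multiplication helper returns the minimum and maximum products, on the
reading that the cross term is added in both cases; the division
helper returns the estimate and the approximate error.

```python
def multiply(x1, e1, x2, e2):
    """Extreme products of (x1 ± e1)(x2 ± e2) for positive values."""
    center = x1 * x2
    spread = e1 * x2 + e2 * x1
    cross = e1 * e2
    return center - spread + cross, center + spread + cross


def divide(x1, e1, x2, e2):
    """Estimate and approximate error of (x1 ± e1) / (x2 ± e2).

    Reasonable only when each error is small relative to its value.
    """
    return x1 / x2, (x2 * e1 + x1 * e2) / x2 ** 2


print(multiply(10, 2, 5, 1))     # same as 8 x 4 and 12 x 6
print(divide(10, 0.2, 5, 0.1))
```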

(2) The discussion above has indicated how significant numbers
and range numbers reflect error in results. Whether the error
is handled as part of the calculation or as a separate calculation is
of little importance. What is important is that the error be reflected
in the result.*

* If the numbers used in demonstrating calculations with significant
numbers are written as ranges and the computations redone in the manner
indicated for range numbers, it will be discovered that the upper and
lower limits so determined are, in most cases, outside the implied
limits of the result expressed in significant numbers. This difference
usually is ignored and the result expressed as indicated in II, 2, above,
on significant numbers.
The cumulative effect of such errors can seriously
affect an end result. Assume that it is estimated that at the end of
1945 the gold-exporting country of Ruritania had a gold inventory of
25,000 ± 5,000 kilograms; that annual production during the period
1946-50 was 10,000 ± 2,000 kilograms; and that annual consumption,
including trade, used 9,000 ± 3,000 kilograms. The gold inventory at
the end of 1950 may be determined as follows:

      25,000 ±  5,000
    + 10,000 ±  2,000
    + 10,000 ±  2,000
    + 10,000 ±  2,000
    + 10,000 ±  2,000
    + 10,000 ±  2,000
    -----------------
      75,000 ± 15,000

    -  9,000 ±  3,000
    -  9,000 ±  3,000
    -  9,000 ±  3,000
    -  9,000 ±  3,000
    -  9,000 ±  3,000
    -----------------
      30,000 ± 30,000


The cumulative effect of the error is to increase the possible range
of error from 20 percent in 1945 to 100 percent in 1950. Thus in
absolute terms the gold inventory may be anywhere from zero to 60,000
kilograms.
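
The Ruritanian gold computation can be reproduced with a brief Python
sketch of the Method 2 addition and subtraction rules. Quantities are
represented as (estimate, error) tuples in kilograms, a representation
assumed for this illustration.

```python
def add(a, b):
    # Estimates add, and errors add.
    return (a[0] + b[0], a[1] + b[1])


def subtract(a, b):
    # Estimates subtract, but errors still add.
    return (a[0] - b[0], a[1] + b[1])


inventory = (25_000, 5_000)          # end of 1945
for _ in range(5):                   # production, 1946-50
    inventory = add(inventory, (10_000, 2_000))
for _ in range(5):                   # consumption, 1946-50
    inventory = subtract(inventory, (9_000, 3_000))

estimate, error = inventory
print(estimate, error, f"{error / estimate:.0%}")
```

The error grows with every step whether the step adds or subtracts,
which is why the relative error rises from 20 percent to 100 percent.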

The error need not be greatest in the total. Data
for a total figure may be more accurate than for the components, and
the component figures, given and estimated, may carry greater error
terms.

It should be emphasized that the best estimate may 
not be the midpoint of the range under consideration. In such circum- 
stances the error term will not be symmetrical and will be in the 

form 75 _ 28 • 
3. Confidence Intervals.

Whenever possible, ranges of error should be determined
mathematically. However, such ranges should not include all possible
values (100 percent probability). Inclusion of values at the extremes
of the range usually will increase the width of the range greatly. The
range of error should be one of 95 percent probability -- that is,
sufficiently wide to include the true value 19 times out of 20. Put
another way, the odds should be 19 to 1 that the true value falls
within the range. More inclusive ranges rarely are desirable when
dealing with imprecise data.


When ranges of error must be determined subjectively, they still
should be on a 95 percent probability basis. In spite of the difficulty
of arriving at a 95 percent probability level subjectively, and in spite
of the fact that it is, of necessity, approximated, a conscious attempt
to limit the range of error in this fashion will eliminate portions of
the range where values are less likely to fall. For example, a normal
distribution theoretically will have values extending infinitely in
either direction. A range of all values (100 percent probability) for
the distribution would have to extend from minus infinity to plus
infinity, whereas on a 95 percent probability basis the range of values
may be, for example, 45 to 55. The latter range indicates with a high
degree of probability that the measurement in question lies within its
limits and should be acceptable for most practical purposes. The former
range, which includes an infinite number of values outside the limits of
45 to 55, with only 1 chance in 20 of occurrence, is meaningless under
most circumstances.
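
For a normally distributed measurement, a 95 percent range can be
computed with Python's standard statistics module, as sketched below.
The mean of 50 and standard deviation of 2.551 are assumed values,
chosen only so that the result comes out near the 45 to 55 range
mentioned above.

```python
from statistics import NormalDist


def central_range(mean, standard_deviation, probability=0.95):
    """Limits that enclose the stated central share of a normal curve."""
    distribution = NormalDist(mean, standard_deviation)
    tail = (1 - probability) / 2
    return distribution.inv_cdf(tail), distribution.inv_cdf(1 - tail)


low, high = central_range(50, 2.551)
print(f"{low:.1f} to {high:.1f}")
```

Raising the probability toward 100 percent pushes both limits outward
without bound, which is the point made in the paragraph above.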

IV. Rounding.

1. For the sake of accuracy, numbers should be rounded to eliminate
all digits which are not significant.

2. For the sake of the reader, even significant digits should be
eliminated when they are unnecessary for precise comparison or clarity
of presentation.

3. If the figures to be rounded off amount to less than one-half of
one of the units retained, the last digit retained will remain
unchanged (for example, 425,499 would round to 425,000).

4. If the figures to be rounded off amount to more than one-half
of one of the units retained, the last digit retained will be increased
by one (for example, 425,501 would round to 426,000).

5. If the figures to be rounded off amount to exactly one-half of
one of the units retained, the last digit retained will remain
unchanged if it is even and be increased by one if it is odd (for
example, 424,500 would round to 424,000, while 425,500 would round to
426,000). Treating zero as an even digit, there are 5 odd and 5 even
digits. In an infinitely large number of cases this procedure will
increase the terminal digit half of the time and leave it unchanged
half of the time. Hence its application will tend to reduce the chance
of positive or negative bias in rounding.

6. In calculation, use unrounded figures carrying one or two more
digits than the number of significant figures which can be carried
legitimately in the final answer.
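
Rules 3 through 5 can be written out explicitly in Python. The sketch
below assumes nonnegative whole numbers, and the function name
round_half_even is chosen for this illustration. Integer arithmetic is
used so that exact halves are detected without floating-point error.

```python
def round_half_even(value, unit):
    """Round a nonnegative integer to a multiple of unit.

    Less than half a unit rounds down, more than half rounds up, and
    exactly half goes to whichever retained digit is even.
    """
    quotient, remainder = divmod(value, unit)
    if 2 * remainder > unit:
        quotient += 1
    elif 2 * remainder == unit and quotient % 2 == 1:
        quotient += 1
    return quotient * unit


for figure in (425_499, 425_501, 424_500, 425_500):
    print(f"{figure:,} rounds to {round_half_even(figure, 1_000):,}")
```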

V. Totals.


In presentation of rounded data, correctly rounded totals frequently
are not exactly equal to the apparent sum of components. If it is
feared that this will bother readers, totals may be footnoted to explain
that the discrepancy is due to rounding. Another technique sometimes
used is to force the total by adjusting the figures. This technique
introduces inaccuracies which would be avoided by simply footnoting
discrepancies. If it is used, the component figures, not the total,
should be adjusted, and care must be taken not to misrepresent facts
significantly. In forcing totals, adjustments should be made so that
percentage changes in individual figures will be kept to a minimum.
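
A small Python illustration of why a correctly rounded total can
differ from the sum of rounded components follows. The three component
figures are hypothetical.

```python
components = [1_400, 1_400, 1_400]   # hypothetical unrounded figures

# Round each component, and separately the true total, to thousands.
rounded_components = [round(figure, -3) for figure in components]
rounded_total = round(sum(components), -3)

print(rounded_components, sum(rounded_components))
print(rounded_total)
```

Each component loses 400 in rounding, so the apparent sum falls a full
thousand short of the correctly rounded total.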


VI. Index Numbers. 



Index numbers are a useful device when one wishes to focus attention
on relative changes in some quantity or quantities without considering
the absolute amounts of the quantity or the changes. Since index
numbers usually are in percentage terms, they also are useful for
comparing changes originally measured in noncomparable units (for
example, physical and value terms).

1. Simple Relatives.

One of the simplest and most frequently used indexes is the
simple relative, which is nothing more than the expression of each
datum in a series as a percentage of some other datum in the series
which has been chosen as a base and equated to 100 percent. This
may be illustrated as follows:

            US Midyear Population       Index
    Year         (Thousand)         (1937 = 100)

    1937          128,961                100
    1951          154,360                120
    1952          156,981                122
    1953          159,696                124


The most common type of index is one that measures relative changes
occurring over time. Consequently, the base selected usually is a
measurement for some time period. The base selected will depend on
what the index is to illustrate.
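
The simple relatives in the population illustration can be recomputed
with a few lines of Python. The function name simple_relatives is
hypothetical, and rounding to whole index points follows the
illustration.

```python
population = {1937: 128_961, 1951: 154_360, 1952: 156_981, 1953: 159_696}


def simple_relatives(series, base_year):
    """Express each datum as a percentage of the base-year datum."""
    base = series[base_year]
    return {year: round(value / base * 100) for year, value in series.items()}


print(simple_relatives(population, 1937))
```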

Simple relatives are sometimes expressed as percentages of
the preceding year. For example, 1946 may be expressed as 85 percent
of 1940; 1947 as 105 percent of 1946, and 1948 as 101 percent
of 1947. Such relatives are known as link index numbers. Link index
numbers may be converted to a common base, or chain index, by setting
the base year equal to 100 and obtaining successive values by
multiplying the link index number for each year by the chain index
number for the preceding year. A chain index can be constructed from
the link index numbers given in the example above as follows:

1940 = 100; 1946 = 85 (or 100 x 85); 1947 = 89 (or 85 x 105);
1948 = 90 (or 89 x 101). (See the following tabulation.*)

Sometimes it is desirable to shift the base of such relatives.
There can be a number of reasons for doing so -- for example, desire
to compare indexes originally on different bases, to focus
attention on comparison of data with that for some date of special
interest, or to splice overlapping indexes together. Shifting
relatives from one base to another is done by setting the index value
of the new base period (for example, a year) equal to 100 and
determining the other values on the new base by dividing each old index
value by the old value of the new base and multiplying by 100.** (See
the following tabulation.*)


* P. 14, below.

** Such shifting cannot be done indiscriminately. Some indexes
require recomputation if the base is to be shifted.
            Link Index                  Chain Index Number   Chain Index Number
    Year      Number     Multiply by       (1940 = 100)         (1950 = 100)

    1940       100                             100                  109
    1946        85          100                 85                   92
    1947       105           85                 89                   97
    1948       101           89                 90                   98
    1949       104           90                 94                  102
    1950        98           94                 92                  100
    1951        99           92                 91                   99
    1952       100           91                 91                   99
    1953       104           91                 95                  103
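
The chain index and the shift of base in the tabulation can be
reproduced with the Python sketch below. The function names are
hypothetical. As in the tabulation, each chain value is rounded to a
whole number before it is carried forward to the next year.

```python
links = {1946: 85, 1947: 105, 1948: 101, 1949: 104,
         1950: 98, 1951: 99, 1952: 100, 1953: 104}


def chain_index(link_numbers, base_year):
    """Convert link index numbers to a chain index on base_year = 100."""
    chain = {base_year: 100}
    previous = 100
    for year in sorted(link_numbers):
        # Link number (a percentage) times the preceding chain value.
        previous = round(previous * link_numbers[year] / 100)
        chain[year] = previous
    return chain


def shift_base(index, new_base_year):
    """Divide each value by the old value of the new base, times 100."""
    new_base = index[new_base_year]
    return {year: round(value / new_base * 100) for year, value in index.items()}


on_1940 = chain_index(links, 1940)
on_1950 = shift_base(on_1940, 1950)
print(on_1940)
print(on_1950)
```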




In calculating percentage increases and decreases, a frequent
error is failure to subtract 100, resulting in a figure expressed in
terms of the base rather than as an increase over or a decrease from
the base. Such calculations can be performed properly by dividing
the base figure into the other, multiplying the quotient by 100, and
subtracting 100. The resulting difference indicates the increase or
decrease with appropriate sign; for example, if the base = 400,

    (800/400 x 100) - 100 = +100 percent (increase)

    (300/400 x 100) - 100 = -25 percent (decrease)
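
The percentage-change calculation can be sketched in Python as follows.
The base of 400 and the results of +100 and -25 percent come from the
example; the compared figures 800 and 300 are assumed here because
they are the values that yield those results.

```python
def percentage_change(base, value):
    """Divide by the base, multiply by 100, and subtract 100."""
    return value / base * 100 - 100


print(percentage_change(400, 800))   # an increase
print(percentage_change(400, 300))   # a decrease

# The frequent error: omitting the subtraction gives the figure as a
# percentage of the base, not as a change from the base.
print(800 / 400 * 100)
```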


2. Aggregate and Weighted Indexes.

Frequently, expression of single values as percentages of a 
base is inadequate, and indexes become more complex, entailing aggre- 
gation and/or d " 

With only a limited number of series of data, and knowledge
of their appropriate weights, it is possible to reflect the activity
of an entire economy. For example, by utilizing production data for
four ferroalloying metals, representing 45 percent of Ruritania's total
ferroalloying metals production, it is possible to construct an index
reflecting production of all ferroalloys. The ferroalloying metals
index could then be used, together with similar production indexes for
other types of metals, as a component index in constructing an
all-metals index, which, in turn, could be used to construct an
economy-wide index.


a. Simple Aggregates.

Simple aggregation requires only that the summation of 
data for one period be expressed as a percentage of a similar sum- 
mation for the chosen base period. Where data are all in the same 
units, they may be added directly. Where not in the same units, data 
can be expressed as relatives (each datum as a percentage of the same 
datum in the base period), be added, and then be expressed as a per- 
centage of the base period. Illustrations are given in Tables 1 
and 2.* 


Table 1

Production of Selected Ferroalloying Metals in Ruritania
1946-52

                                                  Thousand Metric Tons

                      1946    1947    1948    1949    1950    1951    1952

Manganese            10.25   10.50   10.75   10.25   10.75   10.90   11.00
Molybdenum            0.48    0.51    0.55    0.55    0.58    0.58    0.58
Chromite              0.90    0.90    0.85    0.80    0.85    0.88    0.90
Tungsten              0.12    0.15    0.15    0.15    0.15    0.15    0.20

Total                11.75   12.06   12.30   11.75   12.33   12.51   12.68

Index (1948 = 100)      96      98     100      96     100     102     103


* Table 2 follows on p. 16. 
Table 2

Indexes of Production of Selected Ferroalloying Metals in Ruritania
1946-52

                                                            1948 = 100

                      1946    1947    1948    1949    1950    1951    1952

Manganese             95.3    97.7   100.0    95.3   100.0   101.4   102.3
Molybdenum            87.3    92.7   100.0   100.0   105.5   105.5   105.5
Chromite             105.9   105.9   100.0    94.1   100.0   103.5   105.9
Tungsten              80.0   100.0   100.0   100.0   100.0   100.0   133.3

Total                368.5   396.3   400.0   389.4   405.5   410.4   447.0

Index                   92      99     100      97     101     103     112


Table 1 illustrates the aggregation of actual values
(where the data are all in the same units) and demonstrates the
domination of the index by the highest weight. Table 2, where the data
are expressed in relatives, eliminates this domination by a single item.*
Use of relatives has the added advantage of reducing all data to
unitless terms, hence making possible the addition of data that
originally are not in comparable, or like, units. In both instances,
however, the relative importance of the four metals (chosen as being
representative of the entire population of ferroalloying metals) has not
been given any consideration. In Table 1, manganese dominates the index
merely because it has the greatest weight of production and not because
of any measure of the relative value of its production to the economy.
In Table 2 the four components have been assigned equal weights -- that
is, all are treated as being of equal importance.**
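
Both simple aggregate indexes can be recomputed from the Table 1 data
with the Python sketch below. The function name aggregate_index is
hypothetical. Relatives are kept unrounded here, whereas Table 2
rounds them to one decimal place before adding; the resulting index
numbers are the same.

```python
production = {   # thousand metric tons, 1946 through 1952 (Table 1)
    "Manganese":  [10.25, 10.50, 10.75, 10.25, 10.75, 10.90, 11.00],
    "Molybdenum": [0.48, 0.51, 0.55, 0.55, 0.58, 0.58, 0.58],
    "Chromite":   [0.90, 0.90, 0.85, 0.80, 0.85, 0.88, 0.90],
    "Tungsten":   [0.12, 0.15, 0.15, 0.15, 0.15, 0.15, 0.20],
}
BASE = 2   # position of 1948 in each row


def aggregate_index(rows, base):
    """Sum each year's column and express it as a percentage of the base."""
    totals = [sum(column) for column in zip(*rows)]
    return [round(total / totals[base] * 100) for total in totals]


# Table 1: add the actual tonnages directly.
index_of_actuals = aggregate_index(production.values(), BASE)

# Table 2: convert each metal to relatives first, then add.
relatives = [[value / row[BASE] * 100 for value in row]
             for row in production.values()]
index_of_relatives = aggregate_index(relatives, BASE)

print(index_of_actuals)
print(index_of_relatives)
```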


* It should be noted that an arithmetic mean was used in arriving at
the annual index in this case. Other measures of central tendency could
have been used. Each has its advantages and disadvantages.

** Indexes such as those in Tables 1 and 2, where, without recourse to
additional information, all components are treated as being of equal
importance, usually are called unweighted, or simple aggregate, indexes.
b. Weighted Aggregates.

Frequently data must be weighted to avoid unreasonable
domination of the index by certain of the components and to insure
that all components exert an influence proportionate to their relative
importance to the economy. Suppose that ferroalloying metals were
priced (in Ruritanian macropounds) as follows:

                    Price per Metric Ton
    Metal            (1947-49 Average)

    Manganese              0.05
    Molybdenum            10.30
    Chromite               0.30
    Tungsten              12.10


If these prices are used as weights in constructing an index from the
data contained in Table 1, the resulting index will better reflect
each metal's share (of the value) of production. In Table 3 the
production data from Table 1 have been multiplied by the price data
given above.

Table 3

Production of Selected Ferroalloying Metals in Ruritania
(Weighted by 1947-49 Average Prices)
1946-52

                                                  Thousand Macropounds

                      1946    1947    1948    1949    1950    1951    1952

Manganese             0.51    0.52    0.54    0.51    0.54    0.54    0.55
Molybdenum            4.94    5.25    5.66    5.66    5.97    5.97    5.97
Chromite              0.27    0.27    0.26    0.24    0.26    0.26    0.27
Tungsten              1.45    1.82    1.82    1.82    1.82    1.82    2.42

Total                 7.17    7.86    8.28    8.23    8.59    8.59    9.21

Index (1948 = 100)      87      95     100      99     104     104     111
The production index from Table 3 indicates a sharper rate
of growth than do the indexes from Tables 1 and 2. The weights have
given the data a new perspective, and the dominating component of the
index is now molybdenum. The weighted index numbers reflect the
production of ferroalloying metals more accurately than do the indexes
from Tables 1 and 2, since the weights introduce the economy's
relative evaluation of the different metals.
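
The weighted aggregate in Table 3 can be reproduced with the Python
sketch below, which uses the decimal module so that halves round to
the even digit as described under IV. The function name weighted_index
is hypothetical. Each weighted value is rounded to two places before
totaling, as in Table 3; the index numbers depend on that intermediate
rounding.

```python
from decimal import Decimal, ROUND_HALF_EVEN

production = {   # thousand metric tons, 1946 through 1952 (Table 1)
    "Manganese":  ["10.25", "10.50", "10.75", "10.25", "10.75", "10.90", "11.00"],
    "Molybdenum": ["0.48", "0.51", "0.55", "0.55", "0.58", "0.58", "0.58"],
    "Chromite":   ["0.90", "0.90", "0.85", "0.80", "0.85", "0.88", "0.90"],
    "Tungsten":   ["0.12", "0.15", "0.15", "0.15", "0.15", "0.15", "0.20"],
}
prices = {"Manganese": "0.05", "Molybdenum": "10.30",
          "Chromite": "0.30", "Tungsten": "12.10"}   # macropounds per ton
HUNDREDTH = Decimal("0.01")


def weighted_index(quantities, weights, base):
    """Price-weight each quantity, total by year, and index on the base."""
    weighted_rows = [
        [(Decimal(q) * Decimal(weights[metal])).quantize(
            HUNDREDTH, rounding=ROUND_HALF_EVEN) for q in row]
        for metal, row in quantities.items()
    ]
    totals = [sum(column) for column in zip(*weighted_rows)]
    index = [
        int((total / totals[base] * 100).quantize(
            Decimal("1"), rounding=ROUND_HALF_EVEN))
        for total in totals
    ]
    return totals, index


totals, index = weighted_index(production, prices, base=2)   # 1948 = 100
print([str(total) for total in totals])
print(index)
```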


As the weighted index is continued for several more years, it is
possible that it will reflect production less realistically. Changes
in demand, prices, or other factors may make the weights derived from
1947-49 data no longer suitable. This sort of bias becomes more likely
as time extends farther and farther from the base, and it may become
necessary to change both the base period and the system of weights.


There are other types of indexes that might be constructed. Different
weights, weights that vary from year to year, and weighted relatives
are but three of many possibilities. These other indexes also would
have their biases and would differ to some extent in what they
measure. Care must be used in selecting the index to be used so that
it will best reflect what is to be indicated. The data used must be
appropriate and must be combined in the most suitable manner. The base
chosen should minimize bias. Consideration should be given to the
tendency of bias to increase as the span of time moves farther from
the base. The brevity of the present research aid precludes more than
hinting at many problems which arise in working with index numbers.
The reader is urged, therefore, to consult detailed literature on
index numbers for further information.


VII. Computing Rates of Increase or Decrease. 


Sometimes it is desirable to compute the average rate of increase or
decrease during a specified time period. A common error committed in
computing such average rates of increase is the simple averaging of
the increases from period to period. This procedure does not take into
account the compound nature of the problem. Average rates of increase
should be computed as follows:

     A/B = (1 + r)^n,

where A = final amount, B = base or initial amount, r = rate of
increase for each time period, and n = number of time periods of
increase or decrease.* Suppose that the price index of a commodity is
as follows:


     Year     Index

     1950      100
     1951      100
     1952      280
     1953      300

Then A = 300, B = 100, and n = 3. Hence 300/100 = (1 + r)^3. (Note
that the only values necessary are those for the base period, the final
period, and the span of time.) This equation may be solved by dividing
the logarithm of A/B (3) by n (3) and subtracting 1 from the antilog:

     log (A/B) / n = 0.477121 / 3 = 0.159040;

the antilog of 0.159040 = 1.442 = (1 + r); hence r = 0.44. The annual
rate of increase from 1950 through 1953 is 44 percent per year. This
indication of the trend is quite different from the result of 62
percent which would be obtained by averaging the annual increases as
follows: (0 + 180 + 7) / 3 = 62.
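
The two computations may be compared in a short Python sketch. It assumes only the index values given in the text; the function names compound_rate and simple_average_of_increases are illustrative and not part of this research aid.

```python
import math


def compound_rate(initial, final, periods):
    """Average rate r per period such that final/initial = (1 + r)**periods,
    found by dividing the logarithm of the ratio by the number of periods."""
    return 10 ** (math.log10(final / initial) / periods) - 1


def simple_average_of_increases(series):
    """The erroneous method: average the period-to-period percentage
    increases without regard to compounding."""
    increases = [100 * (later - earlier) / earlier
                 for earlier, later in zip(series, series[1:])]
    return sum(increases) / len(increases)


index = [100, 100, 280, 300]  # 1950 through 1953, from the text
r = compound_rate(index[0], index[-1], periods=len(index) - 1)
print(f"compound rate:  {100 * r:.0f} percent per year")
print(f"simple average: {simple_average_of_increases(index):.0f} percent")
```

The sketch gives 44 percent for the compound rate and 62 percent for the simple average, agreeing with the figures above. Compounding the 44-percent rate over three periods returns the final index of 300; the 62-percent figure does not.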


* This is the compound interest formula, usually expressed as
Final Amount / Principal = (1 + r)^n; Final Amount = Principal +
Interest.

VIII. Tabular Presentation.

A table should be a completely self-explanatory unit, whether it
occupies part of a page, a whole page, or several pages. Tables
introduced into the body of a report should, however, be introduced at
a logical point and should be referred to (by number) at that point in
the text. Accompanying textual commentary should highlight important
portions of the table and emphasize conclusions based on it.
Commentary should not merely repeat data presented in the table. The
following comments on tabular presentation should be considered in
conjunction with Table 4.*

1. Numbering.

a. Tables should be numbered consecutively, in order of physical
location. Where tables appear in appendixes, the first appendix table
should bear the number following that of the last table in the text.

b. The table number should be above the title and centered on the
page.

c. Tables should be referred to by number in textual commentary.
(Where textual references are widely separated from the table, page
numbers should be given in footnotes.)

d. Small tabulations within the text, which are not set up as tables,
need not be numbered.

2. Title.

a. The title of a table should be in topic form, briefly indicating
what, where, and when, in order of importance. For example, if a report
deals with the harmonica industry in East Germany, a table emphasizing
production might be headed "Production of Harmonicas in East Germany,
1930-52." If, on the other hand, the report is on musical instrument
production in East Germany, a table on production of harmonicas might
be headed "Harmonica Production in East Germany, 1930-52." If the place
is to be emphasized, in a report on world production of harmonicas, the
table might be headed "East German Production of Harmonicas, 1930-52."
Whichever order, or emphasis, is chosen, it should be used consistently
throughout a single report, or, in some instances, throughout a section
of a report having a specific emphasis.

* Table 4 follows on p. 21. Note that this type of footnote should
be inserted when the table does not follow on the same page as the
main reference thereto (for an additional example, see p. 15, above).
All other footnote references to tables should be by page number, with
"above" or "below" added, as the case may be (for example, see p. 25,
below).

                                                           Table 4

[title]                    Production and Cost of Turtle Food in Selected Counties of Ruritania a/
[prefatory note]                                (Excluding Cottage Production)
                                                        1938, 1946-53

[caption]                    X                           Y                           Z                         Total
Year b/           Production          Cost c/ Production          Cost    Production          Cost    Production          Cost
[units]             (Pounds)     (Dollars)      (Pounds)     (Dollars)      (Pounds)     (Dollars)      (Pounds)     (Dollars)

1938                     400            82           120 d/         22    Negligible           0.1 e/        520           100
1946                     210            60            30           9.5    Negligible           0.1           240            70
1947                     260           100            35            12           2.2           0.9           300 f/        110
1948                     300           120            55            19           4.0           1.9           360           140
1949                     300 g/        130            70            25           2.0           1.4           370           160
1950                     320           150            90            35    Negligible           0.2           410           190

  Total, 1946-50       1,400           560           280           100           8.2           4.5         1,700           660

1951 h/                  350           160           100            43           3.3           3.0           450           210
1952                     400           200 i/        120            52 j/        4.1           3.9           520           260
1953 k/                  410           210           130            58           4.2           3.9           540           270

  Total, 1946-53       2,600         1,100           630           250    [illegible]           15         3,200         1,400
  Average, 1946-53       320           140    [illegible]           32           2.5    [illegible]          400           170

a. Except where indicated otherwise, all data contained in Table 4 are
from source 103/. All data are rounded to two significant figures.
Totals and averages are derived from unrounded figures and do not
always agree with rounded data shown.

b. As of 1 July.

c. Except where indicated otherwise, cost data for X are from source
104/.

d. 105/

e.

f.

g.

h.

i.

j.

k. Data for 1953 are from the following sources: County X, 106/;
County Y, 107/; County Z, 108/.

b. Dates showing the period covered should be placed on a separate
line beneath the descriptive part of the title. They should be
presented as follows: 1948, to indicate a single year; May 1948, to
indicate a single month; 1936-37 (but 1899-1900), to indicate two
consecutive years; 1938-40 (but 1895-1910), to indicate more than two
consecutive years; Selected Years, 1940-50, to indicate scattered years
within a period; 1895, 1900, and 1940-48, to indicate widely scattered
years. Fiscal years, crop years, trade years, or averages of
consecutive years should be presented as follows: 1945/46, not 1945-46.
The type of year used, if other than a calendar year, should be
indicated in the title of the table or in a footnote.

c. Titles should appear in initial capitals, should not be
underscored, and should not be followed by a period.

d. When a table covers more than one page, the word "Continued" in
parentheses should appear under the title on all pages except the
first.
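
The date conventions given under Title (2, b) lend themselves to a small Python sketch. The function year_span and its split_year parameter are hypothetical names. The century test (comparing the hundreds of the two years) is an assumption drawn from the examples "1899-1900" and "1895-1910", and applying the same test to fiscal or crop years that cross a century is likewise an assumption, since the text gives no such example.

```python
def year_span(first, last, split_year=False):
    """Format a span of years for the date line of a table title.

    Calendar years are joined by a hyphen, and the final year is cut to
    two digits unless the century changes. A fiscal, crop, or trade
    year (split_year=True) is joined by a slash instead of a hyphen.
    """
    if first == last:
        return str(first)
    separator = "/" if split_year else "-"
    same_century = first // 100 == last // 100
    last_text = f"{last % 100:02d}" if same_century else str(last)
    return f"{first}{separator}{last_text}"


print(year_span(1938, 1940))                   # more than two consecutive years
print(year_span(1899, 1900))                   # century changes, so no abbreviation
print(year_span(1945, 1946, split_year=True))  # fiscal, crop, or trade year
```

Scattered years ("Selected Years, 1940-50") and single months are matters of wording rather than arithmetic and are left outside the sketch.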

3. Prefatory Note.

A prefatory note may be placed directly beneath the descriptive part
of the title and before the date(s) for the purpose of clarifying or
limiting the title, provided this explanation can be given in a brief
phrase. Such brief notes should be of a general nature, applying to
all or most of the table. Longer explanations or explanations of
specific items in the table always should be given in footnotes.

4. Spacing.

Double-space between the title (including the word "Continued" in
parentheses, when used) and the beginning of the table. The unit of
measurement, placed as indicated under Units (see 5, a, below), would
mark the beginning of the table for this purpose.

5. Units.

a. Where the unit of measurement is the same throughout the table, it
should be placed at the extreme right on the solid line marking the
beginning of the table.

b. If there is more than one unit of measurement and the data are
arranged in columns, units should be indicated in parentheses in the
caption, below column headings. Where data are arranged in rows, a
units column may be used to designate the unit of measurement for each
row. Units in a units column are not to be enclosed in parentheses.

6. Caption (Column Headings).

a. Column headings should be brief and in the singular.

b. Comparable column headings should be consistent -- for example,
value is comparable to quantity, not tons.

c. Units of measurement should appear in parentheses under the
appropriate heading (see 5, b, above).

d. Underscoring of column headings or subheadings should extend to
the limits of all columns under the heading.

7. Stub (Row Headings).

a. Items in the stub (side entries) should be listed in the order
best suited to the data -- for example, according to importance or
geographically, alphabetically, and so forth.

b. If a second line is required for a stub entry, the second line
should be indented, and related column entries should be placed
opposite the bottom line of the stub entry.

c. Subheadings should be double-spaced below main headings and
indented two spaces. Items subordinate to subheadings should be
double-spaced below the subheadings and indented two spaces.

d. Totals and averages should be double-spaced below the entry which
they follow, and the designation should be indented two spaces.

e. Entries in the stub should not be underscored.

8. Body.

a. Ciphers (0) should be entered where data have values of zero.

b. Where data have values equal to or less than one-half of the
minimum unit being carried, the word "Negligible" should be entered.

c. Where data are not available, but the phenomenon which the data
would represent is known to exist, enter "N.A." in the appropriate
column.

d. Where data are not available and it is not known whether any item
or activity exists to be represented, enter "Unknown."

e. [Illegible], hyphens, and dashes should never be used in a table.

f. In figures of four or more digits the comma should be used.

g. Totals and averages should be underscored with a single line,
grand totals and averages with a double line.
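
Rules a through d and f above amount to a small decision procedure, which may be sketched in Python as follows. The function format_entry and its parameters are hypothetical. The sketch assumes that the minimum unit carried is a fixed number of decimal places (whereas Table 4, for example, rounds to two significant figures) and that a missing figure is represented by None.

```python
def format_entry(value, decimals=0, known_to_exist=False):
    """Render one body entry of a table.

    value          -- the datum, or None where no figure is available
    decimals       -- decimal places carried; the minimum unit is 10**-decimals
    known_to_exist -- for a missing figure, whether the phenomenon is known to exist
    """
    if value is None:
        # Rules c and d: distinguish "N.A." from "Unknown".
        return "N.A." if known_to_exist else "Unknown"
    if value == 0:
        # Rule a: a true zero is entered as a cipher.
        return "0"
    minimum_unit = 10 ** -decimals
    if abs(value) <= minimum_unit / 2:
        # Rule b: not more than one-half of the minimum unit carried.
        return "Negligible"
    # Rule f: the "," format option supplies the comma in figures of
    # four or more digits.
    return f"{value:,.{decimals}f}"


for datum in (0, 0.4, 1400, None):
    print(format_entry(datum, known_to_exist=True))
```

The test for zero is made before the test for negligible quantities, so that a true zero is never reported as "Negligible."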

9. Footnotes.

a. Lower-case letters should be used as references to table
footnotes.

b. In making footnote references, each line should be considered in
its order, with more than one reference on a given line lettered
consecutively from left to right.

c. Footnote references should be placed after an entry. When there
is no entry, the footnote reference should be placed in the position
of the entry.

d. When a table is more than one page long, footnote entries should
appear at the end of the table. In such cases the first footnote
reference should be followed by an asterisk (a/*), and the asterisked
reference at the bottom of the first page should read, for example, as
follows: "Footnotes to Table 3 follow on p. 6."

10. Sources.

Numerical source references should be kept out of the body of the
table. References, numbered in the same numerical sequence used in the
text, should follow title, caption (column headings), or row headings.
Where use of multiple sources necessitates some clarifying statement
in addition to simple source references, lower-case letters may be
substituted for reference numbers in the title, caption (column
headings), or row headings, or, for individual items, in the body of
the table with the reference number appearing in the footnote below
(see Table 4*). The Sources appendix should carry source references
for each entry.

11. Small Tabulations within the Text.

a. Small tabulations within the text need not have the format of a
table and may be considered part of the text.**

b. Such tabulations always are introduced verbally, usually with a
statement such as "Production of these items during the 2-year period
was as follows:"

c. Footnotes to small tabulations within the text should be
considered as text footnotes and treated accordingly.

IX. Graphic Presentation.

1. General Notes.

a. Graphic presentation is intended to reveal relationships of data
at a glance. Graphs, or charts, should be completely self-explanatory
units. Graphs often are numbered serially (as Figure 1, Figure 2, and
so on), and whether numbered serially or not, they always should be
introduced and referenced in the text.***

b. The title should indicate what, where, and when, in order of
importance.

c. No source reference need be given on the graph itself. Source
references should be given immediately below the graph or following
any title or footnotes below the graph.

d. The vertical axis (ordinate) should indicate units of the
dependent variable.

* P. 21, above.

** See pp. 3, 5, 13, and so on, above.

*** See pp. 26, 27, and 28, below. Note that graphs will not be given
page numbers. When they follow the last page of the report, they may
be referenced as "inside back cover."

e. The horizontal axis (abscissa) should indicate units of the
independent variable. Frequently, time is represented on the horizontal
axis. In such cases the value plotted against time (where the unit of
time is a period -- for example, year or month -- rather than a
specific point in time) should be plotted at the midpoint of the
particular time interval.


f. The legend should indicate what each curve, bar, and so 
forth, on the graph represents. 


g. Footnotes should appear below the graph, on the left. When the
graph is based on data set forth elsewhere in tables or text, it is
not necessary to repeat either source references or explanatory notes.
Reference to the table, or to the page on which the data originally are
explained, is adequate. For example: "Data are from Table 1, p. 16,
above" (or, as the case may be, "Data are from p. 16, above").

h. The zero point always should be indicated on arithmetic 
scale graphs. Where space limitations make inclusion of the complete 
scale, starting with zero, inconvenient, a break is used to indicate 
omission of part of the scale. 

2. Arithmetic Graphs (Figure 1*). 

a. Arithmetic graphs -- that is, those plotted on arithmetic scales
-- are best for examination of series in real terms. From Figure 1 it
may be seen readily that the values of curve A are at least 10 times
as large as those of curve B.

b. If relative changes are to be examined, the arithmetic graph may
lead to erroneous conclusions. For example, the fluctuations in curve A
appear more violent than those in curve B. However, the fluctuation of
curve A from 1936 to 1937 is 100 percent; that of curve B for the same
period is 400 percent (each figure being the change divided by the 1936
value, times 100). Because the purpose of graphics is to make
relationships of principal importance instantly apparent to the reader,
other forms of graphs are more suitable for indicating relative change.
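
The contrast between absolute and relative change may be illustrated with a short Python sketch. The readings used for curves A and B are hypothetical, chosen only to yield the 100-percent and 400-percent changes cited above; they are not taken from Figure 1. The function name percent_change is likewise illustrative.

```python
def percent_change(earlier, later):
    """Relative change from the earlier value to the later value, in percent."""
    return 100 * (later - earlier) / earlier


# Hypothetical 1936 and 1937 readings (not the actual data of Figure 1).
curve_a = (100, 200)
curve_b = (2, 10)

for name, (y1936, y1937) in (("A", curve_a), ("B", curve_b)):
    print(f"curve {name}: absolute change {y1937 - y1936}, "
          f"relative change {percent_change(y1936, y1937):.0f} percent")
```

On an arithmetic scale the rise of 100 units in curve A dwarfs the rise of 8 units in curve B, although the relative change in curve B is four times as great.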

3. Semilogarithmic Graphs (Figure 2*).

a. The semilogarithmic graph is a common method of charting relative
change. Since equal distances on a logarithmic scale represent equal
ratios, or percentage changes (as contrasted with equal amounts on
arithmetic scales), the reader can ascertain relative change easily.
For example, in Figure 2 relative growth between 1936 and 1940
obviously is greater for curve B than for curve A, since the vertical
distance between values is greater for curve B than for curve A. This
was not apparent when the same data were plotted on an arithmetic scale
in Figure 1.

* Following p. 28.

b. In logarithmic graphs, regardless of the scales used, equal
distances always will mean equal ratios. (This makes possible the use
of two or more scales on a single chart to facilitate comparison on a
relative basis.) Because of this property, the semilogarithmic graph is
useful in illustrating ratios, comparison of data in different units,
and examination of relative growths and differences.

c. A logarithmic scale never starts at zero, because each phase or
cycle of the scale has values 10 times those of the corresponding
values of the preceding cycle. If the scale were started at zero, each
succeeding cycle also would start at zero (10 x 0). Although the
initial value of the first cycle can be any value other than zero,
proper selection of an initial value prevents having to use a scale
with awkward units.
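
The properties described under 3, a through c, can be checked numerically. In the Python sketch below the vertical distance on a logarithmic scale is taken as the difference of common logarithms (that is, measured in cycles). The function name log_distance and the sample values are illustrative assumptions.

```python
import math


def log_distance(low, high):
    """Vertical distance between two values on a logarithmic scale,
    measured in cycles (powers of 10)."""
    return math.log10(high) - math.log10(low)


# Equal ratios cover equal distances, whatever the level of the series.
doubling_low = log_distance(1, 2)
doubling_high = log_distance(100, 200)
print(math.isclose(doubling_low, doubling_high))

# A fivefold rise covers more distance than a doubling.
print(log_distance(2, 10) > log_distance(100, 200))

# Zero cannot be located on the scale: its logarithm is undefined.
try:
    math.log10(0)
    zero_can_be_plotted = True
except ValueError:
    zero_can_be_plotted = False
print(zero_can_be_plotted)
```

The last check corresponds to rule c: Python raises an error for the logarithm of zero, just as a logarithmic scale offers no position for it.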

4. Graphs of Index Numbers (Figure 3*).

a. Another means frequently used to illustrate relative change is
the graphing of index numbers. Where index numbers are expressed as
percentages of a base (original units being lost in computation of the
index numbers), the graphing of series of indexes makes possible quick
comparison of relative changes in terms of the bases. (Equal distance
reflects equal percentage change.) From Figure 3 it can be seen that
for any given time period, the curve farthest from the base line of 100
has experienced the greatest change relative to the base.

b. Graphs of index numbers also make possible easy compari- 
son of series measured in different terms and series that are con- 
siderably different quantitatively. 


* Following p. 28.

5. Bar Charts (Figure 4*).

a. Bar charts may be used to plot frequency distributions (as are
frequency polygons, or line charts, of the type described above). Bar
charts are best suited for comparing data for a few years only, line
charts being preferable when the number of items being compared is
large or when the independent variable (for example, time period)
assumes more than a few values.

b. Bars should be of equal width, length being the only variant.

6. Pie Charts (Figure 5*).

Pie charts frequently are used to give a quick, rough comparison of
relative sizes. Approximate percentages of the sections should be
indicated on the diagram.

7. Pictorial Diagrams (Figure 6*).

Pictorial diagrams are a device for attracting attention. They are a
rough means of presentation and may take almost any form. It usually
is preferable to use items of constant size, varying the numbers,
rather than to use items of different sizes to reflect different
values. The latter approach leaves too much interpretation to the
reader.


* Following p. 28.


Figure 1

RURITANIA
PRODUCTION OF A AND B*
1936-40

[Arithmetic graph not reproduced.]

* Excludes cottage production.


Figure 2

RURITANIA
PRODUCTION OF A AND B*
1936-40

[Semilogarithmic graph not reproduced.]


Figure 3

RURITANIA
INDEXES OF PRODUCTION OF A AND B*
1936-40
1936 = 100

[Graph of index numbers not reproduced; horizontal axis: 1936 through
1940.]

* Excludes cottage production.


Figure 4

RURITANIA
PRODUCTION OF C AND D, 1935-37

[Bar chart not reproduced.]


Figure 5

AREA OF RURITANIA, BY USE
1954

[Pie chart not reproduced; legible sections: Residential, 25%;
Farming, 30%; Forestry, 35%.]


Figure 6

POPULATION OF RURITANIA, BY SEX
1954
(Each figure represents one million Ruritanians)

[Pictorial diagram not reproduced.]